Feature-based Model Selection for Time Series Forecasting

37th ISF Cairns, Australia

Thiyanga Talagala, Rob J Hyndman, George Athanasopoulos

28 June 2017

Methods to forecast time series

Large collections of time series

Forecasting demand for thousands of products across multiple warehouses

Google and Yahoo both collect millions of time series

  • Queries, revenue, number of users for many services (such as web search, YouTube, etc.)

Energy Consumption

Forecasting Multiple Time Series

  • Individual Model Building and Combined Forecasts

Forecasting Multiple Time Series

  • Aggregate selection rule
    • Develop a single method which provides better forecasts across all time series.
    • No free lunch!

Other Challenge…

  • Forecasting is often required by non-experts in the field of time series analysis.

Motivation

  • If the reasons for differences in the performance of various forecasting methods are explored, they may be useful in guiding the choice of the best forecasting method (Reid, 1972).

  • That some methods perform better than others on some series cannot be entirely due to chance (Makridakis et al., 1982).

  • Can the features of a time series be used to draw conclusions about which forecasting method will work best for forecasting its future values?

Let the data speak for themselves

What is this project about?

Propose a classification framework which selects forecast models based on features calculated from time series.

Methodology

Methodology - Random Forest

Random Forest - Basic Algorithm for Classification

  • Let N be the number of trees to build.

  • For each of the N iterations:
    • Select a new bootstrap sample from the training set.
    • Grow an un-pruned tree based on the bootstrap sample.
    • At each node, select m variables at random from the p variables.
    • Select the best split point among the m variables.
  • Overall prediction: majority vote from all individually built trees.
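
The steps above can be sketched in plain Python. This is a minimal illustration and not the implementation used in the study: the Gini impurity criterion, the function names, the toy feature values and the class labels "ARIMA" and "ETS" are all assumptions made for the example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, m, rng):
    """Select m of the p variables at random; return the best split among them."""
    p = len(X[0])
    best = None  # (weighted impurity, variable index, threshold)
    for j in rng.sample(range(p), m):
        # Every observed value except the largest is a candidate threshold.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def grow_tree(X, y, m, rng):
    """Grow an un-pruned tree: split until a node is pure or cannot be split."""
    if len(set(y)) == 1:
        return y[0]  # pure leaf
    split = best_split(X, y, m, rng)
    if split is None:  # all sampled variables are constant in this node
        return Counter(y).most_common(1)[0][0]
    _, j, t = split
    left = [i for i, row in enumerate(X) if row[j] <= t]
    right = [i for i, row in enumerate(X) if row[j] > t]
    return (j, t,
            grow_tree([X[i] for i in left], [y[i] for i in left], m, rng),
            grow_tree([X[i] for i in right], [y[i] for i in right], m, rng))

def predict_tree(tree, row):
    """Follow splits until a leaf is reached (leaves are labels, nodes are tuples)."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def random_forest(X, y, n_trees, m, seed=0):
    """Build n_trees trees, each on its own bootstrap sample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample rows with replacement
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], m, rng))
    return forest

def predict_forest(forest, row):
    """Overall prediction: majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy training set (invented values): two features per series, one label per series.
X = [[0.0, 0.1], [1.0, 0.9], [0.5, 1.2], [1.5, 0.4],
     [9.0, 10.2], [10.0, 9.1], [9.5, 10.8], [10.5, 9.6]]
y = ["ARIMA"] * 4 + ["ETS"] * 4
forest = random_forest(X, y, n_trees=25, m=1, seed=1)
low = predict_forest(forest, [-1.0, -1.0])
high = predict_forest(forest, [20.0, 20.0])
```

In the framework proposed here, each row of `X` would hold the features of one time series and each label would name the forecasting model chosen for that series.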

Methodology - Simulate Time Series

Approach 1

  • AR(1)
  • AR(2)
  • MA(1)
  • MA(2)
  • ARMA(1,1)
  • ARMA(2,2)
  • ARIMA(1,1,1)
  • ARIMA(1,1,0)
  • ARIMA(2,2,2)
  • ARIMA(1,2,1)
  • ARIMA(1,2,0)
  • ARIMA(0,2,1)
  • ANN
  • AAN
  • MNN
  • MAN
  • MAdN
  • AAdN
  • MMN
  • MMdN
  • Random walk
  • Random walk with drift
  • White noise process
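
As an illustration, three of the processes listed above (white noise, AR(1), and a random walk with or without drift) could be simulated as follows. The slides do not give parameter values or series lengths, so the coefficient, drift, burn-in and length used here are assumptions, as are the function names.

```python
import random

def simulate_white_noise(n, sigma=1.0, rng=None):
    """White noise: independent Gaussian values with mean 0."""
    rng = rng or random.Random()
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def simulate_ar1(n, phi, sigma=1.0, burn_in=100, rng=None):
    """AR(1): y_t = phi * y_{t-1} + e_t, discarding an initial burn-in."""
    rng = rng or random.Random()
    series, prev = [], 0.0
    for _ in range(n + burn_in):
        prev = phi * prev + rng.gauss(0.0, sigma)
        series.append(prev)
    return series[burn_in:]

def simulate_random_walk(n, drift=0.0, sigma=1.0, rng=None):
    """Random walk: y_t = y_{t-1} + drift + e_t, starting from 0."""
    rng = rng or random.Random()
    series, level = [], 0.0
    for _ in range(n):
        level += drift + rng.gauss(0.0, sigma)
        series.append(level)
    return series

# Build a small labelled collection: each simulated series keeps the name
# of the process that generated it (labels and settings are illustrative).
rng = random.Random(42)
collection = [
    ("White noise process", simulate_white_noise(50, rng=rng)),
    ("AR(1)", simulate_ar1(50, phi=0.7, rng=rng)),
    ("Random walk", simulate_random_walk(50, rng=rng)),
    ("Random walk with drift", simulate_random_walk(50, drift=0.5, rng=rng)),
]
```

Each simulated series carries a known label, which is what a classifier needs as training data.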

Approach 2

Approach 3

Methodology - Time Series Features

  • Strength of trend
  • Spectral Entropy
  • Hurst Exponent
  • Lyapunov Exponent
  • Differencing order
  • Parameter estimates of Holt linear trend model
  • Box-Pierce statistic
  • Length
  • Coefficient of determination of the linear trend model
  • First auto-correlation coefficient of the residual series of the linear trend model
  • ACF and PACF based features
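
Two of the features above, the coefficient of determination of the linear trend model and the first auto-correlation coefficient of its residual series, can be sketched as below. The sketch assumes an ordinary least squares fit of the series on the time index 0, 1, ..., n-1; the function name and the example series are made up for illustration.

```python
def linear_trend_features(y):
    """R^2 of a least-squares linear trend and lag-1 autocorrelation of its residuals.

    A perfectly linear or constant series (zero sums of squares) is not handled.
    """
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    resid = [v - (intercept + slope * t) for t, v in enumerate(y)]
    sse = sum(e * e for e in resid)           # residual sum of squares
    sst = sum((v - y_mean) ** 2 for v in y)   # total sum of squares
    r_squared = 1.0 - sse / sst
    # OLS residuals with an intercept have mean zero, so no centring is needed.
    resid_acf1 = sum(resid[t] * resid[t - 1] for t in range(1, n)) / sse
    return {"r_squared": r_squared, "resid_acf1": resid_acf1}

# Example series (invented): an upward trend with an alternating pattern around it.
features = linear_trend_features([1, 3, 2, 4, 3, 5])
```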

ACF and PACF based features

ACF based
  • First auto-correlation coefficient

  • Sum of squares of first 10 auto-correlation coefficients

  • Number of significant spikes (out of first five) in the ACF

PACF based
  • Sum of squares of first 5 partial auto-correlation coefficients

  • Number of significant spikes (out of first five) in the PACF
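
A minimal sketch of these five features follows. It makes assumptions the slides do not state: the usual sample autocorrelation estimator, partial autocorrelations obtained with the Durbin-Levinson recursion, and a spike counted as significant when it lies outside ±1.96/√n. Function and key names are illustrative.

```python
import math

def acf(y, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def pacf(y, max_lag):
    """Partial autocorrelations phi_11, ..., phi_kk via Durbin-Levinson."""
    r = acf(y, max_lag)
    out, phi_prev = [], []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[0]
            phi = [phi_kk]
        else:
            num = r[k - 1] - sum(phi_prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
        phi_prev = phi
    return out

def acf_pacf_features(y):
    """The five ACF and PACF based features listed above."""
    bound = 1.96 / math.sqrt(len(y))  # assumed significance bound
    r = acf(y, 10)
    p = pacf(y, 5)
    return {
        "acf1": r[0],
        "acf_sumsq_10": sum(v * v for v in r),
        "acf_n_signif_5": sum(abs(v) > bound for v in r[:5]),
        "pacf_sumsq_5": sum(v * v for v in p),
        "pacf_n_signif_5": sum(abs(v) > bound for v in p),
    }

# Example series (invented): a linear trend plus a slow cycle, 60 observations.
series = [0.1 * t + math.sin(0.3 * t) for t in range(60)]
features = acf_pacf_features(series)
```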

Methodology - Random Forest

Preliminary Study

  • Limit the analysis to non-seasonal time series.

  • Yearly data of the M3 competition.

  • Develop a random forest classifier to select the most appropriate forecasting model for the yearly data of the M3 competition.

  • All models are selected using the training sets, and model evaluation is done using the test sets.

  • The accuracy of our method is compared against several benchmarks and other commonly used forecasting approaches.

Results

Distribution of MASE

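[Figure: distribution of MASE by method; MASE axis from 0 to 40. Values annotated on the plot:]

  • RF: 1.84
  • auto.arima: 1.86
  • ets: 2.01
  • WN: 6.40
  • RW: 2.27
  • RD: 1.92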
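
The slides do not define MASE. A minimal sketch, assuming the usual non-seasonal definition (mean absolute forecast error on the test set, scaled by the in-sample mean absolute error of the one-step naive forecast), is given below; the function name and the numbers are made up for illustration.

```python
def mase(train, test, forecasts):
    """Mean absolute scaled error for a non-seasonal series (assumed definition)."""
    # Scale: in-sample mean absolute error of the naive forecast y_t = y_{t-1}.
    scale = sum(abs(train[t] - train[t - 1])
                for t in range(1, len(train))) / (len(train) - 1)
    # Mean absolute error of the forecasts over the test set.
    mae = sum(abs(actual - f) for actual, f in zip(test, forecasts)) / len(test)
    return mae / scale

# Illustrative numbers: the naive in-sample errors are 1, 2, 3 (scale 2);
# the forecast errors are 1 and 3 (MAE 2).
example = mase(train=[1, 2, 4, 7], test=[8, 10], forecasts=[7, 7])
```

A value below 1 means the forecasts are, on average, more accurate than one-step naive forecasts were on the training data.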

Model Composition

What next?

  • Develop a more comprehensive set of features that are useful in identifying different data generating processes.

  • Extend the time series collection to seasonal data.

  • Test on several large-scale real time series data sets.

  • Consider other classification methods.

Thank You